Ontology-Based Geographic Data Set Integration

Author

  • Henricus Theodorus Johannes Antonius Uitermark
Abstract

Fig. 5. The Real World (or Terrain) is abstracted according to abstraction rules into a Conceptual World, and described in a Domain Ontology. What is known in the Conceptual World is acquired according to surveying rules and, depending on the application, captured in a Geo-Data Set (adapted from (van der Schans 1994) and (Winter 2000, p.420)).

2.2 Abstraction Rules and Surveying Rules

Abstracting the Real World is a two-step process (Fig. 5):

1. There exist classes of real-world phenomena. There may be many classes of real-world phenomena, or terrain objects, but only terrain objects from classes that are relevant for a certain discipline, and that can be identified and labeled, are included as concepts, or object classes, in a domain ontology. [6] The rules that govern this selection, from classes of terrain objects into classes of the domain ontology, are defined as abstraction rules. [7]
2. With this collection of object classes we look at the terrain: it is as if we wear a pair of glasses through which only instances of object classes of the domain ontology pass. From this filtered collection of terrain objects, only those relevant for a certain application, and included in an application ontology, are acquired or ‘captured’ into a geographic data set.

[6] To be as general as possible we use the term object class as a synonym of concept.
[7] A fundamental problem is excluded from the discussion here: how to talk about the Real World without a real-world ontology? This is a meta-meta activity: how to formulate rules for the formation of abstraction rules (van der Schans 1997).

Surveying rules (or, alternatively, acquisition rules) are defined as rules that govern the transformation from the terrain objects actually observed, defined as instances of object classes in the domain ontology, into instances of geographic data set object classes, as defined in an application ontology.
Surveying rules define which object classes are represented and how they are represented. Consequently, surveying rules include:

- inclusion rules: which instances of object classes are selected (‘capture criteria’ in Open GIS Consortium vocabulary (Open GIS Consortium Inc. 1998))
- simplification rules: how instances of object classes are simplified
- aggregation rules: how instances of object classes are merged, and
- representation rules: how instances of object classes are represented.

2.3 Surveying Rules and Context

The production of a geographic data set is done within a context, depending on the discipline of the user. Each discipline has its own definitions of object classes and their attributes. Definitions depend on the aggregation level used: local, regional, national, etc. Each level has different terrain objects, which may be composites at another level, depending on the type of use: analysis, planning, or design (Molenaar 1998, p.157). [8]

This notion of context is broad. In this research, however, the concept of context has a specific meaning. Surveying rules contain additional conditions that depend not only on properties of terrain objects per se, but also on the situation in the terrain, that is to say, on relationships between terrain objects; for example, how far apart are terrain objects, or what kinds of terrain objects are adjacent to each other? Consequently, context is determined by thematic, geometric, and topologic properties of possibly multiple terrain objects, and surveying rules are context dependent. For example, two buildings in the terrain, less than two meters apart, may be acquired and represented as one single building instance in a data set; or a terrain situation with sidewalks between flowerbeds may be aggregated into one single composite flowerbed instance.
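To make the idea of a context-dependent aggregation rule concrete, the following is a minimal sketch of the building example, not the author's implementation. It assumes, purely for illustration, that footprints are axis-aligned rectangles in meters and that a merged building is represented by the bounding rectangle of its parts; the names `Building`, `gap`, and `aggregate` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Building:
    """Axis-aligned footprint in meters (a simplifying assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def gap(a: Building, b: Building) -> float:
    """Shortest distance between two footprints; 0 if they touch or overlap."""
    dx = max(a.xmin - b.xmax, b.xmin - a.xmax, 0.0)
    dy = max(a.ymin - b.ymax, b.ymin - a.ymax, 0.0)
    return hypot(dx, dy)


def aggregate(buildings: list[Building], max_gap: float = 2.0) -> list[Building]:
    """Merge footprints that are less than max_gap apart into one instance."""
    parent = list(range(len(buildings)))

    def find(i: int) -> int:
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # The rule looks at pairs of terrain objects, not at single objects:
    # this is what makes it context dependent.
    for i in range(len(buildings)):
        for j in range(i + 1, len(buildings)):
            if gap(buildings[i], buildings[j]) < max_gap:
                parent[find(i)] = find(j)

    groups: dict[int, list[Building]] = {}
    for i, b in enumerate(buildings):
        groups.setdefault(find(i), []).append(b)

    # Represent every group by its bounding rectangle (assumed representation rule).
    return [
        Building(min(b.xmin for b in g), min(b.ymin for b in g),
                 max(b.xmax for b in g), max(b.ymax for b in g))
        for g in groups.values()
    ]


terrain = [
    Building(0.0, 0.0, 10.0, 10.0),
    Building(11.5, 0.0, 20.0, 10.0),   # 1.5 m from the first building
    Building(30.0, 0.0, 40.0, 10.0),   # 10 m from the second building
]
data_set = aggregate(terrain)
```

In this sketch the first two buildings end up as one data set instance and the third stays separate. Merging is transitive here (a chain of close buildings becomes one instance), which is a design choice of the sketch rather than something the text prescribes.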
[8] A formal context is defined as a triple (O, A, I) where O and A are sets and I is a binary relation between O and A: I ⊆ O × A. Elements of O and A are respectively object classes and attributes (Wille 1992).

2.4 The Construction of a Domain Ontology for Topographic Mapping

In Section 2.1.3 and Section 2.2 it was argued that we need a domain ontology for geographic data set integration. This domain ontology should be ‘rich’ enough; that is to say, it should contain enough concepts for interconnecting different application ontologies.

There is an official Dutch standard for topographic data set transfer, called Geo-Information Terrain Model (GTM), that claims to be such a ‘vehicle’ (Ravi 1995). Let us see whether elements of the GTM Standard are suitable for the construction of a domain ontology for topographic mapping.

2.4.1 GTM Standard

The subtitle of the GTM Standard positions the GTM as a classification: “Terms, definitions and general rules for the classification and coding for earth related spatial objects”. According to the Foreword, the ultimate goal of the GTM is a general classification for the transfer of geo-information between organizations, such as municipalities, water boards, and electricity companies. It seeks a balance between a general, global approach and a more specific approach in the description of geo-data set objects, with a tendency toward the global approach.

Furthermore, the GTM is terrain related, not map related (van der Schans 1994). A terrain related description concentrates on the terrain and its geometric and non-geometric characteristics, independent of its future map representation. In addition, the focus of the GTM is object-structured, which means that recognizable objects in the terrain serve for the demarcation of listed elements in the classification.
GTM defines an object as a ‘phenomenon in the terrain that exists independently of other phenomena that can be recognized separately’. The level of detail of objects is in particular determined by the physical discernibility in the terrain (for example, building instead of dwelling).

2.4.2 GTM Standard as a Domain Ontology

Section 2.1.2 offered an operational definition of an ontology. This definition contained four items, which are checked here against the GTM Standard:

1. The GTM Standard is a collection of concepts.
2. GTM Standard concepts are defined in natural-language terms (for example, a ‘road’ is ‘a leveled part for traffic on land’).
3. The collection of concepts in the GTM Standard is limitative (a closed list).
4. The GTM Standard has structure (concepts are classified into object classes; object classes belong to groups; every object class has a fixed set of attributes, with every attribute having a domain with values).

Based on these criteria we conclude that the GTM Standard is an ontology. In addition, the GTM Standard is related to the traditional discipline of topographic mapping and land surveying; therefore it is a domain ontology.

A critical issue is that definitions of GTM concepts are given in natural-language terms. Such definitions might lead to ambiguity. For example, the definition of ‘road’ quoted above gives no clue about the lateral extension of a road: is a verge part of the road?

2.4.3 GTM Standard and its Usefulness for Data Set Integration

In Section 2.4.2 it was demonstrated that the GTM Standard is a domain ontology. How useful is this domain ontology for the integration of topographic data sets?

The GTM Standard originated within the professional circle of land surveyors. Therefore, the GTM Standard has a sufficient number of concepts for topographic data sets. These concepts are divided into object classes, with a sufficient number of attributes.
When two or more topographic data sets are integrated, most of the time not all possible topographic object classes are represented. Therefore one does not need all object classes from the GTM Standard. The same reasoning applies to the collection of attributes involved: while the GTM Standard has many attributes for a single object class, the number of attributes of a single object class in a data set is usually much smaller.

Furthermore, the GTM Standard has a global overall structure. The structure reflects the dominant viewpoint of the Real World as a surface divided by road networks, railway networks, and water networks, with ‘otherland’ (= the rest of the Real World) in between these networks. Road networks, railway networks, water networks, and ‘otherland’ can each be described in greater detail.

Differences in data sets are caused by differences in abstraction. Therefore, our conclusion is that the GTM Standard is useful for integration, provided we are able to:

1. define subclasses, possibly to the level of data classes, to express differences in abstractions between data sets (Section 2.5.1), and
2. add structure that reflects compositions in the data sets involved (Section 2.5.2).

Keeping these issues in mind brings us to the construction of reference models, where data sets get their semantic transparency.

2.5 The Construction of a Reference Model

In order to integrate different geographic data sets a reference model is constructed:

1. Object classes in a reference model are a subset of object classes from a domain ontology. This subset is determined by the geographic data sets to be integrated. Object classes from this subset are refined into subclasses. This refinement is also determined by the geographic data sets to be integrated (Section 2.5.1).
2. Object classes from this subset are refined into subclasses in a taxonomy classification.
   More structure is added to the reference model if object classes from different application ontologies are composed of each other. This composition is then expressed as a partonomy classification in the reference model (Section 2.5.2).
3. Relationships between reference model object classes and application ontology object classes define the semantics of geographic data sets. The basic relationship, between a reference model class and a geographic data set class, is introduced in Section 2.5.3.
4. Relationships between object classes from different application ontologies are defined in Section 2.5.4. Three types of semantic relationships are defined: semantic equivalent, semantic related, and semantic relevant.
5. Finally, attention is given to special situations in the construction of a reference model: missing object classes (Section 2.5.5 and Section 2.5.7), and object class instances acquired in parts (Section 2.5.6).

2.5.1 Object Classes for a Reference Model

Selection of object classes for a reference model depends on the object classes in the application ontologies. Surveying rules determine the relationships between object classes from the domain ontology (and, therefore, the reference model) and object classes from the application ontologies. As was mentioned in Section 2.3, surveying rules are context dependent, that is to say, dependent upon thematic, geometric, and topologic properties of multiple terrain object instances. To avoid an explosion in the number of object classes in the reference model, context information is excluded as much as possible from the definition of these classes. Therefore, the approach in this research is to include in the reference model information from surveying rules only down to the level of data classes (Molenaar 1998). Data classes are created by discretizing the value of an attribute, choosing useful limits.
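As an illustration of such a discretization, here is a minimal sketch that maps a continuous attribute value onto data classes. It uses the limits of 2 and 4 meters from the road example in this section; the attribute name `track_width_m` and the class labels are hypothetical and chosen only for this sketch.

```python
from bisect import bisect_left

# Limits (in meters) separating the data classes, taken from the road
# example in this section; the labels are hypothetical names.
WIDTH_LIMITS_M = [2.0, 4.0]
ROAD_DATA_CLASSES = ["road_track_le_2m", "road_track_2_to_4m", "road_track_gt_4m"]


def road_data_class(track_width_m: float) -> str:
    """Map a continuous track width onto one of three discrete data classes."""
    if track_width_m <= 0:
        raise ValueError("track width must be positive")
    # bisect_left assigns a width equal to a limit to the lower class, so
    # 2.0 falls in the '<= 2 m' class and 4.0 in the '2 to 4 m' class.
    return ROAD_DATA_CLASSES[bisect_left(WIDTH_LIMITS_M, track_width_m)]
```

Keeping the limits in one list means a different data set with other limits only needs a different list, not a different rule.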
For example, domain object class ‘road’ is refined into three data classes: roads with (a) tracks ≤ 2 meters wide, (b) tracks 2 to 4 meters wide, and (c) tracks > 4 meters wide. Alternatively, a characteristic attribute is chosen, like ‘free standing annex’ versus ‘adjacent annex’.

Excluding context from the reference model has the advantage that a reference model is easier to adapt if we want to integrate another data set with different context-dependent surveying rules. Another advantage of controlling the number of classes is surveyability: the relationships between the reference model and the application ontologies can be taken in at a glance (Artale et al. 1996). However, excluding context requires consistency checking of corresponding object instances (see Section 2.9).

2.5.2 Basic Structures in a Reference Model

As was mentioned before, two abstraction mechanisms are fundamental in the production of geographic data sets:

- there is a generalization/specialization classification, which means that classes are grouped into a taxonomy with superclasses and subclasses (Fig. 6).
- there is a composite/component classification, which means that classes are grouped into a partonomy, with composite and component classes (Fig. 7).

In this research it is assumed that:

1. A partonomy has a two-level composite/component structure.
2. Component classes are optional: at least one component class is a constituent to a composite class. Multiplicity of instances (0, 1, or more) of component and composite classes depends on contents and context.
3. Component classes are non-exclusive, and can therefore be shared by other composite classes.

Both classifications, taxonomy and partonomy, are basic structures for reference models, and are combined into a tree-like structure. In Chapter 3 this tree-like structure will be defined as a finite directed graph.
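The two classifications and the three assumptions can be sketched as a small directed-graph structure. This is only an illustration under stated assumptions, not the formal definition given in Chapter 3: the class `ReferenceModel`, its methods, and the example class names (including the composite ‘street’) are hypothetical.

```python
from collections import defaultdict


class ReferenceModel:
    """Toy reference model: a taxonomy plus a two-level partonomy."""

    def __init__(self) -> None:
        self.subclasses = defaultdict(set)  # superclass -> its subclasses
        self.components = defaultdict(set)  # composite class -> its component classes

    def add_subclass(self, superclass: str, subclass: str) -> None:
        self.subclasses[superclass].add(subclass)

    def add_component(self, composite: str, component: str) -> None:
        if composite == component:
            raise ValueError("a class cannot be a component of itself")
        used_as_component = set().union(*self.components.values())
        # Assumption 1: two levels only, so no class is both composite and component.
        if composite in used_as_component or component in self.components:
            raise ValueError("partonomy is limited to two levels")
        # Assumption 2 holds by construction: a composite only exists here
        # once it has at least one component class.
        # Assumption 3: the component may already belong to another composite.
        self.components[composite].add(component)

    def edges(self) -> set:
        """All relationships as labeled directed edges (source, target, kind)."""
        taxonomy = {(sup, sub, "subclass")
                    for sup, subs in self.subclasses.items() for sub in subs}
        partonomy = {(comp, part, "component")
                     for comp, parts in self.components.items() for part in parts}
        return taxonomy | partonomy


model = ReferenceModel()
model.add_subclass("road", "road_track_le_2m")
model.add_component("composite_flowerbed", "flowerbed")
model.add_component("composite_flowerbed", "sidewalk")
model.add_component("street", "sidewalk")  # shared component class
```

Trying to add a third level, for example `model.add_component("district", "street")`, raises a `ValueError` in this sketch because ‘street’ is already a composite class.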

Publication date: 1999